Planning as an Iterative Process


David E. Smith 

NASA Ames Research Center 
Moffett Field, CA 94035-0001
david.smith@nasa.gov


Abstract 

Activity planning for missions such as the Mars Ex- 
ploration Rover mission presents many technical chal- 
lenges, including oversubscription, consideration of 
time, concurrency, resources, preferences, and uncer- 
tainty. These challenges have all been addressed by the 
research community to varying degrees, but significant 
technical hurdles still remain. In addition, the integra- 
tion of these capabilities into a single planning engine 
remains largely unaddressed. However, I argue that 
there is a deeper set of issues that needs to be consid- 
ered - namely the integration of planning into an itera- 
tive process that begins before the goals, objectives, and 
preferences are fully defined. This introduces a num- 
ber of technical challenges for planning, including the 
ability to more naturally specify and utilize constraints 
on the planning process, the ability to generate multiple 
qualitatively different plans, and the ability to provide 
deep explanation of plans. 


Introduction 

Often, planning systems are regarded as simple isolated 
components that accept a set of goals, a set of initial con- 
ditions, and a description of the possible actions that can be 
performed, as illustrated in Figure 1. The output is a plan -
a program of actions - that can be executed to achieve the 
goals. Increasingly, planning systems are being applied to 
real world problems where responsiveness may be impor- 
tant, where replanning is the norm, and where the planning 
system must interface with humans. When humans actively 
take part in the decision making and planning process, the 
process is often referred to as mixed-initiative planning. 

Much of the work on mixed-initiative planning has fo- 
cused on low-level guidance of the planning process - al- 
lowing the user to choose which goals or subgoals are con- 
sidered next, to choose which actions should be used to 
achieve goals or subgoals, to choose when goals or sub- 
goals are achieved, and to choose where actions should be 
placed in a plan. In addition, mixed initiative systems of- 
ten allow the user to edit a partial or completed plan by 
moving actions around, locking actions down, or removing 
actions. The MAPGEN planning system, which was used 
to do daily planning for the two Mars Exploration Rovers
(MER), follows this model (Bresina et al. 2005). Human 
Tactical Activity Planners (TAPs) used MAPGEN in an in- 
teractive mode where they would select and place activities 
on timelines, and MAPGEN would instantiate details and 
enforce constraints. The TAP could also remove and reorder 
activities, and MAPGEN would identify and flag any vio- 
lated constraints. This approach was quite successful, and 
has led to similar follow-on systems being adopted for the 
Phoenix Mars Lander and the recently launched Mars Sci- 
ence Laboratory. However, this approach addresses only a 
small part of the planning problem, and does not take full ad- 
vantage of the power of automated planning, as many of us 
in AI would like. There are a number of technical difficulties 
that stand in the way of a more fully automated approach to 
planning for such missions, and I discuss some of these in 
the next section. However, there is a bigger issue with the 
planning process - it is still being considered as a separate, 
isolated component that is used after the scientific team has 
fully specified their goals and preferences. It is this issue 
which I would like to bring to the fore - integrating plan- 
ning into an iterative process where the goals, objectives, 
and preferences are only partially understood. 


[Figure 1 diagram: Initial Conditions, Goals, and Action Descriptions are inputs to the Planner, which outputs a Plan.]

Figure 1: The traditional view of planning as an independent component.


MER Planning 

Figure 2 shows a meeting of the Science Operations Working
Group (SOWG) for one of the MER rovers. During the
first year of MER operations, a meeting like this took place
each Martian night, in order to decide on the science goals 
and activities for the next day. There are a number of dif- 
ferent scientists in the room, including planetary geologists, 
atmospheric scientists, and biologists. In addition, there 
are many engineers in the room with specialized knowledge 
of particular instruments, rover mobility and driving, arm 
manipulation and placement, thermal characteristics of the 
rover, the power system, communication systems, and various
software systems. As with any group this large, you
would not expect there to be complete agreement about the
goals for the next day. Different scientists have different 
places they want the rover to go and different measurements 
they want it to take. These measurements are not entirely in- 
dependent; a scientist may want multiple different measure- 
ments of a specific rock, or might want atmospheric mea- 
surements at regular intervals. There are time constraints 
and preferences as well, due to lighting and temperature con- 
siderations. For example, when taking a visible image of a 
location, good illumination is important. However, when 
using an infrared spectrometer, the instrument needs to be 
cold and dark. There are many additional constraints on 
resources, such as energy and power available throughout 
the day, data storage available, and available communication 
windows. 



Figure 2: Science Operations Working Group (SOWG) meeting 
for one of the MER rovers. 

What we’d like to think is that the scientists would pro- 
duce a nice clean set of goals, which, together with the ob- 
jective criteria and current rover state (initial conditions), 
could be fed into a planning engine. We could then turn the 
crank and get out a detailed plan (like that shown in Figure 
3) that could then be uplinked to the rover. Unfortunately, 
this is not a simple STRIPS-style planning problem, for a
number of reasons: 

Oversubscription This is an oversubscription planning 
problem, which means that not all the goals can be ac- 
complished, given the resources available. With a di- 
verse group of scientists, it is no surprise that they want 
to achieve more than is possible, given the time, energy 
and data storage available. As is typical in such problems,
there are different values to different goals, and combinations
of goals, so that it is not an easy matter to identify an
optimal subset of goals to pursue, even if the goals were
independent of each other.

Temporal Planning This is a temporal planning problem, 
which means that different actions have different dura- 
tions, and concurrent actions are necessary. There are also 
many time constraints on various activities, due to illu- 
mination constraints, temperature constraints, solar power 
availability, atmospheric conditions, and communication 
windows. 

Resources There are discrete and continuous resources,
such as state of battery charge and data storage, that are 
temporarily used, consumed, or produced by different ac- 
tivities. 

Preferences There are preferences involved - scientists 
may have preferences for one objective over another, but 
they may also have preferences for time windows, or for 
the order in which experiments are done. 

Uncertainty There is uncertainty about the initial state of 
the rover (battery charge, pose, terrain map), about the 
exogenous events (atmospheric conditions, dust devils, 
communication bandwidth, solar radiation), and about the 
outcomes of actions (pose, energy usage, time taken). 

All of these issues have received attention over the last ten 
years and considerable progress has been made. However, 
there are still some significant shortcomings to this work. 
For oversubscription planning, work has largely been lim- 
ited to the special case of net-benefit planning, where ac- 
tions are augmented with costs, goals are augmented with 
rewards, and the optimal plan is defined as the one with 
the greatest sum of rewards less action costs (Benton, Do, 
and Kambhampati 2009). What is missing is work on the 
tougher problem of oversubscription planning where actions 
consume resources and there are limits on the resources 
available (energy, data storage). For this class of problems, 
action “costs” (resource usage) are not directly comparable 
to goal rewards. 
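
The distinction can be made concrete with a small sketch. The goal names, rewards, energy and data figures, and the per-unit energy price below are all invented for illustration, and brute-force goal selection stands in for actual planning. Under a net-benefit objective every goal here pays for itself, so all of them would be selected; with hard limits on energy and data storage the same goals no longer fit, and a subset must be chosen without ever converting resources into reward units.

```python
from itertools import combinations

# Hypothetical goals: name -> (reward, energy needed in Wh, data needed in Mb).
GOALS = {
    "image_rock": (10, 30, 40),
    "spectrum_rock": (8, 25, 10),
    "atmosphere_scan": (5, 10, 30),
    "drive_to_ridge": (12, 60, 5),
}


def net_benefit(selected, cost_per_wh=0.1):
    """Net-benefit objective: total reward minus total action cost.

    Pricing energy in reward units (cost_per_wh) is exactly the assumption
    that breaks down when resources are limited rather than purchased.
    """
    reward = sum(GOALS[g][0] for g in selected)
    cost = sum(GOALS[g][1] for g in selected) * cost_per_wh
    return reward - cost


def best_under_limits(energy_limit, data_limit):
    """Pick the highest-reward goal subset that respects both resource limits."""
    best, best_reward = (), 0
    names = sorted(GOALS)
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            energy = sum(GOALS[g][1] for g in subset)
            data = sum(GOALS[g][2] for g in subset)
            reward = sum(GOALS[g][0] for g in subset)
            feasible = energy <= energy_limit and data <= data_limit
            if feasible and reward > best_reward:
                best, best_reward = subset, reward
    return set(best), best_reward


print(net_benefit(GOALS))           # every goal has positive net benefit
print(best_under_limits(100, 60))   # but only some goals fit the limits
```

Enumeration is exponential in the number of goals and ignores the actions needed to achieve them, so this only illustrates the difference between the two objectives, not a way to solve the problem.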

For temporal planning, relatively little attention has been 
paid to problems with large numbers of exogenous events 
and time constraints. For this kind of problem, it is not at all 
clear that forward state-space search with any of the currently
popular search heuristics will be very effective. For
preferences, there has been little work on dealing with time
preferences, namely preferences on the order in which certain
activities are performed, or the time windows in which
activities are performed.[1]

[1] It is generally rather awkward to express many of these pref-

For planning under uncertainty, most work has been lim-
ited to consideration of instantaneous actions, and sequential
plans. Dealing with concurrent actions that have uncertain
duration, or have uncertain use of continuous resources such
as energy, is particularly problematic. In a state-space approach,
one is forced to encode time in the state space, and
consider combinations of actions starting at different times.
If actions have uncertain durations, the branching factor is
large, and the search space quickly becomes intractable.
There have been some notable attempts at addressing these 
problems (e.g. Younes and Simmons 2004, Aberdeen and 
Buffet 2007, Mausam and Weld 2008, Meuleau et al. 2009), 
but, so far, these techniques make many simplifying assump- 
tions, and are far from achieving the kind of performance 
typical of state-of-the-art classical planning systems. 
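
To give a feel for the branching-factor problem, here is a rough count under simplifying assumptions that are mine, not drawn from any of the cited systems: durations are discretized to a few values, and three hypothetical activities are started together. Each combination of durations is a distinct outcome that a state-space search would have to consider from this one decision point.

```python
from itertools import product


def joint_duration_outcomes(durations_per_action):
    """Enumerate every combination of durations for actions started together."""
    return list(product(*durations_per_action))


# Hypothetical discretized durations, in minutes, for three concurrent activities.
options = [
    [10, 15, 20],  # imaging
    [30, 40, 50],  # spectrometer integration
    [5, 10, 15],   # atmospheric measurement
]

outcomes = joint_duration_outcomes(options)
print(len(outcomes))  # 3 * 3 * 3 = 27 joint outcomes
```

With d possible durations for each of k concurrent actions the count is d^k, and it multiplies again at every later decision point, before continuous resource usage is even considered.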



Figure 3: A rover plan. Note the prevalence of concurrency and 
the range of activity durations. 

Apart from these individual issues, there has been little at- 
tempt to integrate all of the above capabilities together. Each 
one of these problems is hard enough, and the techniques de- 
veloped so far are not exactly plug-and-play. 

The Broader Problem 

Although the above issues are a significant barrier to producing
a system capable of tackling the MER planning problem,
resolving them alone is not enough. The broader problem is that
initially the scientists don’t have a clear idea of what is achievable,
of what their goals should be, of the relative value of
different goals, or of their preferences. Through a process of 
proposing and examining different options they eventually 
arrive at a set of primary and secondary goals that they can 
hand off to a TAP to produce a detailed plan like that shown 
in Figure 3. If they could use a planner much earlier in the 
process it would help them to develop their goals and prefer- 
ences. In effect, it would allow them to perform trade stud- 
ies to examine the space of possible goals, preferences, and 
plans. This makes the planning part of an iterative process, 
like that shown in Figure 4, which has several interesting 
implications: 

Figure 4: Rover activity planning viewed as an iterative process
of plan revision under constraints.

Plan Constraints As the process goes on, the scientists in-
creasingly place constraints on the nature of the plan. For
example, from an initial run, they might decide they like
particular choices or activities, and want to keep them,
while abandoning others. Thus, they might want to say
something like:

Keep activities A, B, and C, but make sure you do A
before 4pm, and B before C.

or

Don’t do both D and E unless there is extra energy
available after doing all the other primary goals.

While it may be possible to enforce such constraints by 
using PDDL 3.0 preferences (Gerevini et al. 2009) or 
cleverly modifying operator descriptions, it is awkward 
to do so, and it is unclear whether existing planners are 
able to efficiently cope with such constraints. 

Multiple Plans In the early stages of the process, the scien- 
tists have not yet completely settled on or elucidated their 
preferences or their optimization criteria. As a result, it 
is not clear what the best plan is. The planner needs to 
be able to return multiple solutions that are “qualitatively
different” and somehow reflect the range of possible pref- 
erences or optimization criteria that might be considered 
by the scientists. 

Plan Explanation These plans are complicated and the sci- 
entists need to be able to ask questions and get back useful 
answers. Some questions, like: 

Why is activity A in the plan?

would seem to be relatively easy to answer, but others such
as:

Why is action A done before action B?

What would happen if I delayed action A until 4pm? 

Why wasn’t goal G chosen for inclusion in the plan? 

Why didn’t you satisfy preference P? 

are much tougher to answer because they require deeper 
analysis of the relationship between actions in the plan, 
are hypothetical in nature, or are negative questions ask- 
ing why something wasn’t done differently. 
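
The first kind of request under Plan Constraints above ("Keep activities A, B, and C, but make sure you do A before 4pm, and B before C") can at least be written down as data and checked against a candidate plan. The sketch below is a minimal illustration, not a proposal for a constraint language: the `Activity` type, the `check_constraints` function, and the reading of "do A before 4pm" as "A ends by 16:00" are all assumptions. The conditional constraint on D and E depends on leftover energy and is not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    name: str
    start: int  # minutes after local midnight
    end: int


def check_constraints(plan, keep, end_before, ordered):
    """Return a list of human-readable violations (empty if there are none)."""
    by_name = {a.name: a for a in plan}
    violations = []
    for name in keep:
        if name not in by_name:
            violations.append(f"{name} is missing from the plan")
    for name, deadline in end_before.items():
        a = by_name.get(name)
        if a is not None and a.end > deadline:
            violations.append(f"{name} ends after minute {deadline}")
    for first, second in ordered:
        a, b = by_name.get(first), by_name.get(second)
        if a is not None and b is not None and a.end > b.start:
            violations.append(f"{first} does not finish before {second} starts")
    return violations


# A hypothetical revised plan in which B and C overlap.
plan = [
    Activity("A", 14 * 60, 15 * 60),
    Activity("B", 15 * 60, 15 * 60 + 30),
    Activity("C", 15 * 60 + 10, 16 * 60),
]
violations = check_constraints(
    plan,
    keep=["A", "B", "C"],
    end_before={"A": 16 * 60},
    ordered=[("B", "C")],
)
print(violations)
```

Checking a finished plan is the easy half. The harder problem raised in the text is getting a planner to search efficiently for plans that satisfy such constraints in the first place.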





Addressing the above three issues would allow us to em- 
bed a planning system into an iterative process like that en- 
visioned in Figure 4, and use it to perform trade studies 
that can help the scientists refine and elucidate their goals, 
objectives, and preferences. One could regard this overall 
process as being preference elicitation (Chen and Pu 2004; 
Brafman and Domshlak 2009). We are, in fact, trying to help
the scientists converge on the right “product”, in this case a 
plan to achieve their goals. As with many other instances of 
preference elicitation, direct questioning of the scientists to 
try to elicit their preferences would likely be annoying and 
would converge very slowly. In addition, the scientists may 
not even be aware of some of their preferences or their impli- 
cations. The process we have described is more like that of 
example-critiquing as described by Viappiani, Faltings, and 
Pu (2006), in which examples (plans) are presented, and crit- 
icisms lead to the expression of additional preferences. In 
our case the preferences are often very complex. The scien- 
tists generally don’t select a specific plan or directly indicate 
a preference for one plan over another - instead, they often 
zero in on parts of plans that they want preserved or changed, 
and provide additional constraints that would enforce these 
preferences. In addition, these preferences tend to be heavily 
dependent on the current state and resources available, as
well as the set of goals being considered. These preferences 
also tend to evolve with the process of scientific discovery. 
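
One way to picture this loop is sketched below, under heavy simplification. Goal subsets stand in for plans, the goal names and energy figures are invented, and sampling goal rewards uniformly at random is merely one assumed way of producing candidates that reflect different possible valuations. A critique is expressed as goals that must be kept or dropped, and the next round of proposals respects it.

```python
import random
from itertools import combinations

# Hypothetical goals: name -> energy needed (Wh). Goal rewards are unknown.
ENERGY = {
    "image_rock": 30,
    "spectrum_rock": 25,
    "atmosphere_scan": 10,
    "drive_to_ridge": 60,
}
ENERGY_LIMIT = 100


def feasible_goal_sets(required=frozenset(), excluded=frozenset()):
    """Yield goal sets that fit the energy limit and respect the critique."""
    names = sorted(set(ENERGY) - set(excluded))
    for k in range(len(names) + 1):
        for subset in combinations(names, k):
            chosen = frozenset(subset)
            energy = sum(ENERGY[g] for g in chosen)
            if required <= chosen and energy <= ENERGY_LIMIT:
                yield chosen


def propose(n_samples=50, required=frozenset(), excluded=frozenset(), seed=0):
    """Return the distinct goal sets that are best under some sampled valuation."""
    rng = random.Random(seed)
    candidates = list(feasible_goal_sets(required, excluded))
    proposals = set()
    for _ in range(n_samples):
        reward = {g: rng.uniform(1, 10) for g in ENERGY}
        best = max(candidates, key=lambda s: sum(reward[g] for g in s))
        proposals.add(best)
    return proposals


first_round = propose()
# Critique: "keep the drive, but drop the spectrum measurement".
second_round = propose(
    required=frozenset({"drive_to_ridge"}),
    excluded=frozenset({"spectrum_rock"}),
)
print(first_round)
print(second_round)
```

Each proposal is optimal for at least one sampled valuation, which is one crude reading of "qualitatively different". Real critiques would also constrain timing and ordering, not just goal membership.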

Carving off Pieces 

The overall problem of building an integrated planning sys- 
tem that addresses all of the issues I raised in the preceding 
sections is quite daunting. Fortunately, there are some sepa- 
rable components and issues that I believe can be addressed 
by the research community: 

• Oversubscription under resource constraints. It’s time to 
move beyond the simple net-benefit model and deal with 
the problem of generating plans for oversubscribed prob- 
lems where actions use resources, and there are limits on 
the resources available. Benton, Do, and Kambhampati 
(2009) have developed effective heuristics for net-benefit 
problems, but it is not obvious how to extend these heuris- 
tics to this more general class of problems. 

• Temporal planning with time windows, time constraints, 
and temporal preferences. While this can be done now, 
there is little indication that current heuristics will be ef- 
fective on larger scale problems. What is needed here is 
the development of more effective search strategies and 
heuristics for these problems. 

• Planning under time uncertainty. Here, what is needed is 
a practical, computationally tractable approach that wor- 
ries less about constructing complete and optimal poli- 
cies, and more about simple, partial policies that do things 
like introducing slack in important places, and making 
sure that the resulting plan will not result in a dead end 
with poor reward. Some early forays in this direction 
are Musliner, Durfee, and Shin (1993), Dearden et al. 
(2003), Gough, Fox, and Long (2004), and Foss, Onder, 
and Smith (2007). 


• Plan revision under constraints. We need simple ways 
of expressing constraints on how a planner should be al- 
lowed to revise a plan (keep this subset of activities but 
do A after 4pm, and make sure you do B before C), and 
good search techniques for producing plans that satisfy 
those constraints. Being able to place constraints on the 
nature of a plan could be seen as a special case of a very 
old idea - McCarthy’s Advice Taker (McCarthy 1990). 

• Producing multiple, qualitatively different plans. While 
there has been some preliminary work in this area (Tate, 
Dalton, and Levine 1998; Myers and Lee 1999), we need 
to take a deeper look at this problem. The root of the prob- 
lem is that the planner doesn’t initially have a complete 
model of the value of different goals or the costs (resource 
usage) of different actions. A real solution to this problem 
needs to explicitly consider uncertainty in the valuation of 
goals and uncertainty in resource usage of actions. A crit- 
ical part of what it means for two plans to be qualitatively 
different is for those plans to make different assumptions 
about goal rewards, and action durations or resource us- 
age. One possible source of inspiration here is work on 
presenting distinct solutions to users for purposes of pref- 
erence elicitation (e.g. Viappiani, Faltings, and Pu 2006). 

• Plan explanation. Literature on the problem of plan expla- 
nation appears to be surprisingly sparse. Questions like: 

Why is activity A in the plan?

can be answered relatively easily by elucidating the causal 
structure of the plan, thereby identifying what conditions 
the action achieves that are needed to support other ac- 
tions and achieve desired goals. Questions like: 

Why is this action done before that one?

require a deeper analysis of the relationship between two 
activities, and the constraints that govern their relative 
positions in the plan. Bresina and Morris (2006) have 
done some work on explaining temporal inconsistencies 
in plans. Hypothetical questions, such as: 

What would happen if I delayed this action until 
4pm? 

would seem to require modifying the plan and simulating 
it to determine what parts of the plan still work, and what 
parts will now violate time or resource constraints. A rea- 
sonable answer to this type of question might be some- 
thing like: 

Delaying this action until 4pm would mean delaying
actions A4, A5, and A6, so the goal of photographing
Rock13 could no longer be completed while it is in
direct sunlight, violating preference P.

Questions such as:

Why wasn't this resource used instead of that one? 

might require reinvoking the planner with additional con- 
straints, and comparing the resulting plan with the orig- 
inal to determine what is achieved by the two different 
plans, as well as how they differ structurally. Finally, 
purely negative questions, such as: 



Why didn’t you satisfy this preference?

also seem to require replanning with additional con- 
straints - in this case enforcing the preference. An answer 
to this type of question would again seem to require com- 
parison of the new plan with the previous plan. Consider- 
ing this problem in the context of planning appears to give 
us the ability to actually answer such tougher hypothetical 
or counterfactual questions, which goes well beyond cur- 
rent work on inferential question answering in other areas 
of AI. 
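
The hypothetical "what would happen if I delayed this action until 4pm?" question can be sketched as a simple propagation over precedence links followed by a check of time windows. This is only an illustration under stated assumptions: the plan is a set of start times and durations in minutes, precedence is acyclic, successors are pushed later only as far as needed, and the 17:00 end of direct sunlight on Rock13 is an invented number. Resource constraints are not modeled, and the function names are hypothetical.

```python
def delay_and_propagate(starts, durations, successors, activity, new_start):
    """Move `activity` to `new_start` (assumed later than its current start)
    and push back any successor that would otherwise begin too early."""
    new_starts = dict(starts)
    new_starts[activity] = new_start
    frontier = [activity]
    while frontier:
        current = frontier.pop()
        finish = new_starts[current] + durations[current]
        for nxt in successors.get(current, []):
            if new_starts[nxt] < finish:
                new_starts[nxt] = finish
                frontier.append(nxt)
    return new_starts


def explain_delay(starts, durations, successors, deadlines, activity, new_start):
    """Report which other activities move and which deadlines are then missed."""
    new_starts = delay_and_propagate(starts, durations, successors, activity, new_start)
    moved = sorted(a for a in starts if a != activity and new_starts[a] != starts[a])
    broken = sorted(
        a for a, latest_end in deadlines.items()
        if new_starts[a] + durations[a] > latest_end
    )
    return moved, broken


# Hypothetical plan: A enables A4, which enables A5, which enables A6
# (photographing Rock13, which must finish while the rock is in sunlight).
starts = {"A": 14 * 60, "A4": 14 * 60 + 30, "A5": 14 * 60 + 50, "A6": 15 * 60 + 10}
durations = {"A": 30, "A4": 20, "A5": 20, "A6": 15}
successors = {"A": ["A4"], "A4": ["A5"], "A5": ["A6"]}
deadlines = {"A6": 17 * 60}  # assumed end of direct sunlight on Rock13

moved, broken = explain_delay(starts, durations, successors, deadlines, "A", 16 * 60)
print(moved, broken)
```

Turning `moved` and `broken` into a sentence like the sample answer above is then a matter of templating. The replanning-based questions in the remaining bullets would need a planner in the loop rather than simple propagation.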

Conclusions 

Activity planning for the MER rovers presents many tech- 
nical challenges, including consideration of time, concur- 
rency, resources, preferences, and uncertainty. These have 
all been addressed by the research community to varying 
degrees, but significant technical hurdles still remain. The 
integration of these techniques into a single planning engine 
also remains largely unaddressed. In addition, I have argued 
that there is a deeper set of issues that needs to be addressed 
- namely the integration of planning into an iterative process 
that begins before the goals, objectives, and preferences are 
fully defined. This has a number of technical implications 
for planning, including the need to more naturally specify 
and utilize constraints on the planning process, the need to 
generate multiple qualitatively different plans, and the need 
to provide deep explanation of planning decisions. Although 
I introduced these challenges in the context of planning for 
Mars Rovers, the process and issues I’ve outlined are quite 
typical of the planning that goes on in many complex sci- 
ence missions. In particular, similar processes take place in 
planning for crew activities aboard the International Space 
Station (ISS). For the ISS, there is the additional complica- 
tion that the resulting plans will be executed by astronauts, 
rather than robots. The plans must therefore be easily un- 
derstandable by the astronauts, who may want to ask their 
own questions. As smart executives, astronauts don’t readily 
tolerate stupidity or obvious inefficiency in the plans. In addition,
they may take the liberty of reordering actions, interleaving
tasks, collaborating, or substituting resources. This
means that any run-time replanning must be able to model 
and take these deviations into account as well. 